Incorporating multiple-HMM acoustic modeling in a modular large vocabulary speech recognition system in telephone environment

Authors

  • Ascensión Gallardo-Antolín
  • Javier Ferreiros
  • Javier Macías Guarasa
  • Ricardo de Córdoba
  • José Manuel Pardo
Abstract

The use of multiple acoustic models has been reported to yield large improvements on difficult speaker-independent tasks. In this paper, we apply this strategy to a flexible, large-vocabulary, speaker-independent, isolated-word hypothesis generation system in a telephone environment with vocabularies of up to 10,000 words. The new problem addressed here is how to integrate the multiple-model scheme into the system efficiently: because of its bottom-up approach (phonetic string generation followed by a lexical access process), multiple possibilities arise (apart from the alternatives in the training stage), and it is not clear which combination would achieve the best results. The paper gives full details on every alternative, along with results showing actual improvements in the system.

Similar resources

Modular combination of deep neural networks for acoustic modeling

In this work, we propose a modular combination of two popular applications of neural networks to large-vocabulary continuous speech recognition. First, a deep neural network is trained to extract bottleneck features from frames of mel scale filterbank coefficients. In a similar way as is usually done for GMM/HMM systems, this network is then applied as a nonlinear discriminative feature-space t...

Implicit Trajectory Modeling through Gaussian Transition Models for Speech Recognition

It is well known that frame independence assumption is a fundamental limitation of current HMM based speech recognition systems. By treating each speech frame independently, HMMs fail to capture trajectory information in the acoustic signal. This paper introduces Gaussian Transition Models (GTM) to model trajectories implicitly. Comparing to alternative approaches, such as segment modeling and ...

Hidden Markov models for trajectory modeling

Current state-of-the-art statistical speech recognition systems use hidden Markov models (HMM) for modeling the speech signal. However, it is well known that HMM's do not exploit the time-dependence in the speech process, since they are limited by the assumption of conditional independence of observations given the state sequence. Alternative techniques, such as segment modeling approaches, can...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition

We investigate several variants of speech-rate-dependent acoustic models for large-vocabulary conversational speech recognition, in the framework of combining rate-specific models in decoding to compensate for speech rate variation. We study two basic approaches to combining rate-specific models: one combines models at the pronunciation level and the other at the HMM state level. Furthermore, w...

Publication date: 2000